Detecting Portuguese and English Twitter users’ gender
Existing social networking services provide means for people to communicate and express
their feelings in an easy way. Such user-generated content contains clues to users’ behaviors and
preferences, as well as other metadata information that is now available for scientific research.
Twitter, in particular, has become a relevant source for social networking studies, mainly because:
it provides a simple way for users to express their feelings, ideas, and opinions; makes
the user generated content and associated metadata available to the community; and furthermore
provides easy-to-use web interfaces and application programming interfaces (API) to access
data. For many studies, the available information about a user is relevant. However, the gender
attribute is not provided when creating a Twitter account.
The main focus of this study is to infer the users’ gender from other available information.
We propose a methodology for gender detection of Twitter users, using unstructured information
found in the Twitter profile and in user-generated content, and later using the user’s profile picture.
In previous studies, one of the challenges was the labor-intensive task of manually
labelling datasets. In this study, we propose a method for creating extended labelled datasets in
a semi-automatic fashion, using a set of attributes based on unstructured profile information.
With the extended labelled datasets, we associate the users’ textual content with their gender
and create gender models based on the user-generated content and profile information. We
explore supervised and unsupervised classifiers and evaluate the results on both Portuguese and
English Twitter user datasets. We obtained an accuracy of 93.2% with English users and an
accuracy of 96.9% with Portuguese users. The proposed methodology is language independent,
but our focus was on Portuguese and English users.
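The abstract does not say which profile attributes drive the semi-automatic labelling, so the following is only a minimal sketch of one plausible approach: matching first names found in the profile name against gendered name lists and labelling a user only when the evidence is unambiguous. The name lists, the function names `normalize` and `label_profile`, and the matching rule are all assumptions made for illustration, not the authors' method.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical, tiny name lists (assumption); a real dictionary would be far larger.
FEMALE_NAMES = {"maria", "ana", "emily", "sarah"}
MALE_NAMES = {"joao", "pedro", "james", "michael"}


def normalize(text: str) -> str:
    """Lowercase the text and strip diacritics so that 'João' matches 'joao'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def label_profile(profile_name: str) -> Optional[str]:
    """Return 'female', 'male', or None when the evidence is absent or mixed."""
    tokens = re.findall(r"[a-z]+", normalize(profile_name))
    has_female = any(token in FEMALE_NAMES for token in tokens)
    has_male = any(token in MALE_NAMES for token in tokens)
    if has_female == has_male:  # neither list matched, or both did
        return None
    return "female" if has_female else "male"


# Only unambiguous profiles receive a label; the rest stay unlabelled.
profiles = ["João Silva", "Emily R.", "Maria João", "xx_gamer_99"]
labelled = {name: label_profile(name) for name in profiles}
```

Leaving mixed or unmatched profiles unlabelled trades coverage for label quality, which is the usual compromise when automatically produced labels are later used to train classifiers.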
Building a Portuguese Coalition for Biodiversity Genomics
The diverse physiography of the Portuguese land and marine territory, spanning from continental Europe to the Atlantic archipelagos, has made it an important repository of biodiversity throughout the Pleistocene glacial cycles, leading to a remarkable diversity of species and ecosystems. This rich biodiversity is under threat from anthropogenic drivers, such as climate change, invasive species, land use changes, overexploitation or pathogen (re)emergence. The inventory, characterization and study of biodiversity at inter- and intra-specific levels using genomics is crucial to promote its preservation and recovery by informing biodiversity conservation policies, management measures and research. The participation of researchers from Portuguese institutions in the European Reference Genome Atlas (ERGA) initiative, and its pilot effort to generate reference genomes for European biodiversity, has reinforced the establishment of Biogenome Portugal. This nascent institutional network will connect the national community of researchers in genomics. Here, we describe the Portuguese contribution to ERGA’s pilot effort, which will generate high-quality reference genomes of six species from Portugal that are endemic, iconic and/or endangered, and include plants, insects and vertebrates (fish, birds and mammals) from mainland Portugal or the Azores islands. In addition, we outline the objectives of Biogenome Portugal, which aims to (i) promote scientific collaboration, (ii) contribute to advanced training, (iii) stimulate the participation of institutions and researchers based in Portugal in international biodiversity genomics initiatives, and (iv) contribute to the transfer of knowledge to stakeholders and engage the public to preserve biodiversity. This initiative will strengthen biodiversity genomics research in Portugal and fuel the genomic inventory of Portuguese eukaryotic species.
Such efforts will be critical to the conservation of the country’s rich biodiversity and will contribute to ERGA’s goal of generating reference genomes for European species.
The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics.
ABSTRACT: A global genome database of all of Earth’s species diversity could be a treasure trove of scientific discoveries. However, regardless of the major advances in genome sequencing technologies, only a tiny fraction of species have genomic information available. To contribute to a more complete planetary genomic database, scientists and institutions across the world have united under the Earth BioGenome Project (EBP), which plans to sequence and assemble high-quality reference genomes for all ∼1.5 million recognized eukaryotic species through a stepwise phased approach. As the initiative transitions into Phase II, where 150,000 species are to be sequenced in just four years, worldwide participation in the project will be fundamental to success. As the European node of the EBP, the European Reference Genome Atlas (ERGA) seeks to implement a new decentralised, accessible, equitable and inclusive model for producing high-quality reference genomes, which will inform EBP as it scales. To embark on this mission, ERGA launched a Pilot Project to establish a network across Europe to develop and test the first infrastructure of its kind for the coordinated and distributed reference genome production on 98 European eukaryotic species from sample providers across 33 European countries. Here we outline the process and challenges faced during the development of a pilot infrastructure for the production of reference genome resources, and explore the effectiveness of this approach in terms of high-quality reference genome production, considering also equity and inclusion. The outcomes and lessons learned during this pilot provide a solid foundation for ERGA while offering key learnings to other transnational and national genomic resource projects.
Pervasive gaps in Amazonian ecological research
Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space.3,4
While the increasing availability of global databases on ecological communities has advanced our knowledge
of biodiversity sensitivity to environmental changes,5–7 vast areas of the tropics remain understudied.8–11 In
the American tropics, Amazonia stands out as the world’s most diverse rainforest and the primary source of
Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepresented in biodiversity databases.13–15 To worsen this situation, human-induced modifications16,17 may eliminate pieces of the Amazon’s biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus
crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced
environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian
Amazonia, while identifying the region’s vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by
2050. This means that unless we take immediate action, we will not be able to establish their current status,
much less monitor how it is changing and what is being lost.
ATLANTIC EPIPHYTES: a data set of vascular and non-vascular epiphyte plants and lichens from the Atlantic Forest
Epiphytes are hyper-diverse and one of the frequently undervalued life forms in plant surveys and biodiversity inventories. Epiphytes of the Atlantic Forest, one of the most endangered ecosystems in the world, have high endemism and radiated recently in the Pliocene. We aimed to (1) compile an extensive Atlantic Forest data set on vascular, non-vascular plants (including hemiepiphytes), and lichen epiphyte species occurrence and abundance; (2) describe the epiphyte distribution in the Atlantic Forest, in order to indicate future sampling efforts. Our work presents the first epiphyte data set with information on abundance and occurrence of epiphyte phorophyte species. All data compiled here come from three main sources provided by the authors: published sources (comprising peer-reviewed articles, books, and theses), unpublished data, and herbarium data. We compiled a data set composed of 2,095 species, from 89,270 holo/hemiepiphyte records, in the Atlantic Forest of Brazil, Argentina, Paraguay, and Uruguay, recorded from 1824 to early 2018. Most of the records were from qualitative data (occurrence only, 88%), well distributed throughout the Atlantic Forest. For quantitative records, the most common sampling method was individual trees (71%), followed by plot sampling (19%), and transect sampling (10%). Angiosperms (81%) were the most frequently registered group, and Bromeliaceae and Orchidaceae were the families with the greatest number of records (27,272 and 21,945, respectively). Ferns and Lycophytes presented fewer records than Angiosperms, and Polypodiaceae were the most recorded family, and more concentrated in the Southern and Southeastern regions. Data on non-vascular plants and lichens were scarce, with a few disjunct records concentrated in the Northeastern region of the Atlantic Forest. For all non-vascular plant records, Lejeuneaceae, a family of liverworts, was the most recorded family. 
We hope that our effort to organize scattered epiphyte data helps advance the knowledge of epiphyte ecology, as well as our understanding of macroecological and biogeographical patterns in the Atlantic Forest. No copyright restrictions are associated with the data set. Please cite this Ecology Data Paper if the data are used in publication and teaching events. © 2019 The Authors. Ecology © 2019 The Ecological Society of America
Omecamtiv mecarbil in chronic heart failure with reduced ejection fraction, GALACTIC‐HF: baseline characteristics and comparison with contemporary clinical trials
Aims:
The safety and efficacy of the novel selective cardiac myosin activator, omecamtiv mecarbil, in patients with heart failure with reduced ejection fraction (HFrEF) is tested in the Global Approach to Lowering Adverse Cardiac outcomes Through Improving Contractility in Heart Failure (GALACTIC‐HF) trial. Here we describe the baseline characteristics of participants in GALACTIC‐HF and how these compare with other contemporary trials.
Methods and Results:
Adults with established HFrEF, New York Heart Association functional class (NYHA) ≥ II, EF ≤35%, elevated natriuretic peptides and either current hospitalization for HF or history of hospitalization/emergency department visit for HF within a year were randomized to either placebo or omecamtiv mecarbil (pharmacokinetic‐guided dosing: 25, 37.5 or 50 mg bid). 8256 patients [male (79%), non‐white (22%), mean age 65 years] were enrolled with a mean EF 27%, ischemic etiology in 54%, NYHA II 53% and III/IV 47%, and median NT‐proBNP 1971 pg/mL. HF therapies at baseline were among the most effectively employed in contemporary HF trials. GALACTIC‐HF randomized patients representative of recent HF registries and trials with substantial numbers of patients also having characteristics understudied in previous trials including more from North America (n = 1386), enrolled as inpatients (n = 2084), systolic blood pressure < 100 mmHg (n = 1127), estimated glomerular filtration rate < 30 mL/min/1.73 m² (n = 528), and treated with sacubitril‐valsartan at baseline (n = 1594).
Conclusions:
GALACTIC‐HF enrolled a well‐treated, high‐risk population from both inpatient and outpatient settings, which will provide a definitive evaluation of the efficacy and safety of this novel therapy, as well as informing its potential future implementation.
Mortality from gastrointestinal congenital anomalies at 264 hospitals in 74 low-income, middle-income, and high-income countries: a multicentre, international, prospective cohort study
Summary
Background Congenital anomalies are the fifth leading cause of mortality in children younger than 5 years globally.
Many gastrointestinal congenital anomalies are fatal without timely access to neonatal surgical care, but few studies
have been done on these conditions in low-income and middle-income countries (LMICs). We compared outcomes of
the seven most common gastrointestinal congenital anomalies in low-income, middle-income, and high-income
countries globally, and identified factors associated with mortality.
Methods We did a multicentre, international prospective cohort study of patients younger than 16 years, presenting to
hospital for the first time with oesophageal atresia, congenital diaphragmatic hernia, intestinal atresia, gastroschisis,
exomphalos, anorectal malformation, and Hirschsprung’s disease. Recruitment was of consecutive patients for a
minimum of 1 month between October, 2018, and April, 2019. We collected data on patient demographics, clinical
status, interventions, and outcomes using the REDCap platform. Patients were followed up for 30 days after primary
intervention, or 30 days after admission if they did not receive an intervention. The primary outcome was all-cause,
in-hospital mortality for all conditions combined and each condition individually, stratified by country income status.
We did a complete case analysis.
Findings We included 3849 patients with 3975 study conditions (560 with oesophageal atresia, 448 with congenital
diaphragmatic hernia, 681 with intestinal atresia, 453 with gastroschisis, 325 with exomphalos, 991 with anorectal
malformation, and 517 with Hirschsprung’s disease) from 264 hospitals (89 in high-income countries, 166 in middle-income
countries, and nine in low-income countries) in 74 countries. Of the 3849 patients, 2231 (58·0%) were male.
Median gestational age at birth was 38 weeks (IQR 36–39) and median bodyweight at presentation was 2·8 kg (2·3–3·3).
Mortality among all patients was 37 (39·8%) of 93 in low-income countries, 583 (20·4%) of 2860 in middle-income
countries, and 50 (5·6%) of 896 in high-income countries (p<0·0001 between all country income groups).
Gastroschisis had the greatest difference in mortality between country income strata (nine [90·0%] of ten in low-income
countries, 97 [31·9%] of 304 in middle-income countries, and two [1·4%] of 139 in high-income countries;
p≤0·0001 between all country income groups). Factors significantly associated with higher mortality for all patients
combined included country income status (low-income vs high-income countries, risk ratio 2·78 [95% CI 1·88–4·11],
p<0·0001; middle-income vs high-income countries, 2·11 [1·59–2·79], p<0·0001), sepsis at presentation (1·20
[1·04–1·40], p=0·016), higher American Society of Anesthesiologists (ASA) score at primary intervention
(ASA 4–5 vs ASA 1–2, 1·82 [1·40–2·35], p<0·0001; ASA 3 vs ASA 1–2, 1·58 [1·30–1·92], p<0·0001), surgical safety
checklist not used (1·39 [1·02–1·90], p=0·035), and ventilation or parenteral nutrition unavailable when needed
(ventilation 1·96 [1·41–2·71], p=0·0001; parenteral nutrition 1·35 [1·05–1·74], p=0·018). Administration of
parenteral nutrition (0·61 [0·47–0·79], p=0·0002) and use of a peripherally inserted central catheter (0·65
[0·50–0·86], p=0·0024) or percutaneous central line (0·69 [0·48–1·00], p=0·049) were associated with lower mortality.
Interpretation Unacceptable differences in mortality exist for gastrointestinal congenital anomalies between low-income,
middle-income, and high-income countries. Improving access to quality neonatal surgical care in LMICs will
be vital to achieve Sustainable Development Goal 3.2 of ending preventable deaths in neonates and children younger
than 5 years by 2030.
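As a quick arithmetic check on the Findings, the sketch below recomputes the mortality percentages from the reported counts and derives crude (unadjusted) risk ratios against high-income countries. The counts are copied from the abstract; the function names `percent` and `crude_risk_ratio` are illustrative, and the crude ratios are not figures reported by the authors.

```python
# (deaths, patients) per country income group, as reported in the Findings.
MORTALITY = {
    "low-income": (37, 93),
    "middle-income": (583, 2860),
    "high-income": (50, 896),
}


def percent(deaths: int, total: int) -> float:
    """Mortality as a percentage, rounded to one decimal place."""
    return round(100 * deaths / total, 1)


def crude_risk_ratio(group: str, reference: str = "high-income") -> float:
    """Unadjusted ratio of the mortality risk in `group` to that in `reference`."""
    deaths, total = MORTALITY[group]
    ref_deaths, ref_total = MORTALITY[reference]
    return (deaths / total) / (ref_deaths / ref_total)


for group, (deaths, total) in MORTALITY.items():
    print(f"{group}: {deaths}/{total} = {percent(deaths, total)}%")

print(f"crude RR, low vs high: {crude_risk_ratio('low-income'):.2f}")
print(f"crude RR, middle vs high: {crude_risk_ratio('middle-income'):.2f}")
```

The recomputed percentages match the reported 39·8%, 20·4%, and 5·6%. The crude ratios come out at roughly 7.1 and 3.7, well above the reported risk ratios of 2·78 and 2·11. The abstract does not state how its ratios were estimated, but the gap is consistent with estimates that account for the other listed factors, such as sepsis at presentation and ASA score.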